DOCUMENT RESUME

ED 044 433                                                  TM 000 148

AUTHOR       Hambleton, Ronald K.; Traub, Ross E.
TITLE        Information Curves and Efficiency of Three Logistic
             Test Models.
INSTITUTION  Massachusetts Univ., Amherst. School of Education.
PUB DATE     Mar 70
NOTE         19p.; Paper presented at the annual meeting of the
             American Psychological Association, Miami, Florida,
             March 1970
EDRS PRICE   EDRS Price MF-$0.25 HC-$1.05
DESCRIPTORS  *Ability Identification, Data Collection, *Models,
             *Scoring, Simulation, Test Validity

ABSTRACT
     The purpose of this study was to determine the efficiency of the
estimates of ability provided by the one-parameter logistic model as
compared to the estimates provided by the more general two- and
three-parameter models. Several tests were simulated with item
parameters meeting the assumptions of either the two- or
three-parameter model. For each test, the information provided by
ability estimates appropriate to the one-, two- and three-parameter
models was compared at several ability levels. The results indicate
that it is particularly important, when guessing affects test scores,
to use the scoring system of the three-parameter model for estimating
the ability of low-ability examinees. (Author)



CENTER FOR EDUCATIONAL RESEARCH
University of Massachusetts
Amherst
INFORMATION CURVES AND EFFICIENCY OF
THREE LOGISTIC TEST MODELS [1]

Ronald K. Hambleton
University of Massachusetts

and

Ross E. Traub
The Ontario Institute for Studies in Education

[1] Paper presented at the annual meeting of the American Psychological
Association, Miami, 1970.







One way of evaluating a latent trait model for tests is in terms of
the precision with which it estimates an examinee's ability: the more
precise the estimate, the more information the model can be said to
provide. Birnbaum (1966) operationalized this conception of information
as the quantity inversely proportional to the squared length of the
confidence interval for the estimate of an examinee's ability. Defined
in this way, the amount of information in a test is a function of
ability. Mathematically, Birnbaum's information function may be
defined as

$$I[\theta, x] = \frac{\left[ \sum_{g=1}^{n} w_g P_g'(\theta) \right]^2}{\sum_{g=1}^{n} w_g^2 P_g(\theta) Q_g(\theta)} . \tag{1}$$

In equation (1), $I[\theta, x]$ is the amount of information at ability
level $\theta$ provided by scoring formula $x$, where

$$x = \sum_{g=1}^{n} w_g u_g , \tag{2}$$

$n$ is the number of items in the test, $w_g$ is the scoring weight for
item $g$, and $u_g$ is a function which takes the value one if item $g$
is answered correctly, and zero otherwise. The remaining terms of
equation (1) are defined as follows:

$$P_g(\theta) = c_g + (1 - c_g) \frac{e^{D a_g (\theta - b_g)}}{1 + e^{D a_g (\theta - b_g)}} , \tag{3}$$

$$Q_g(\theta) = 1 - P_g(\theta) , \tag{4}$$

and

$$P_g'(\theta) = \frac{d}{d\theta} P_g(\theta) . \tag{5}$$
$P_g(\theta)$ is the characteristic curve for item $g$, with its
mathematical form specified by the test model; it gives the probability
that an examinee of ability $\theta$ answers item $g$ correctly. In the
three-parameter logistic model (Birnbaum, 1968), the item
characteristic curve takes the form presented in equation (3). The
parameters $b_g$ and $a_g$ are usually referred to, respectively, as the
index of difficulty and discrimination of item $g$, while parameter
$c_g$, the lower asymptote of the item characteristic curve, may be
thought of as the guessing parameter. The constant $D$ is a scaling
factor that is usually chosen to be 1.7 to make the logistic
distribution function conform as closely as possible to the normal
(Lord, 1952). A two-parameter logistic model (Birnbaum, 1957, 1958a,
1958b, 1968) may be obtained from the three-parameter model by assuming
that the effect of guessing on test scores is negligible and setting
$c_g$ in equation (3) to zero. If, in addition, it is assumed that the
items in a test have equal discriminating power (i.e., $a_g = a$ for
all $g$, $g = 1, 2, \ldots, n$), the resulting item characteristic curve
has but one free parameter per item (i.e., $b_g$) and specifies a model
that can be shown to be formally equivalent to a test model developed
by Rasch (1960, 1966).

Birnbaum (1968, p. 454) demonstrated that the maximum value of
$I[\theta, x]$, represented as $I(\theta)$, is given by

$$I(\theta) = \sum_{g=1}^{n} \frac{[P_g'(\theta)]^2}{P_g(\theta) Q_g(\theta)} . \tag{6}$$

In general, $I[\theta, x] \le I(\theta)$. Equality holds when the scoring
weights, $w_g$, are chosen such that

$$w_g = \frac{P_g'(\theta)}{P_g(\theta) Q_g(\theta)} , \tag{7}$$

except for a possible scaling factor. Thus to maximize the information
function and consequently minimize the width of the confidence band
about an ability estimate under the one-, two- and three-parameter
logistic models, the scoring weights should be chosen to be $1$,
$D a_g$, and $D a_g \Psi[D a_g (\theta - b_g) - \log c_g]$,
$g = 1, 2, \ldots, n$, respectively. (In the third weight, $\Psi$ is the
logistic distribution function.) Notice that only in the case of the
three-parameter model are the weights dependent on ability. The scoring
system of the three-parameter model has the effect of reducing the
weight assigned to correct answers on items with a sizable guessing
parameter. Moreover, the weight for such items is smallest for
low-ability examinees, who are most likely to have answered by
guessing, and becomes increasingly large as the ability of the examinee
increases.
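The three sets of weights can be checked numerically. The following is a minimal Python sketch, not the authors' program: the function names (`logistic`, `icc`, `icc_slope`, `optimal_weight`, `closed_form_weight`) and the example item values are illustrative assumptions. It evaluates the three-parameter curve of equation (3), the general optimal weight $P_g'/(P_g Q_g)$, and the closed form $D a_g \Psi[D a_g(\theta - b_g) - \log c_g]$, which should agree whenever $c_g > 0$.

```python
import math

D = 1.7  # scaling constant quoted in the text


def logistic(z):
    """Logistic distribution function Psi(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def icc(theta, a, b, c=0.0):
    """Three-parameter logistic item characteristic curve, equation (3)."""
    return c + (1.0 - c) * logistic(D * a * (theta - b))


def icc_slope(theta, a, b, c=0.0):
    """Derivative of the item characteristic curve with respect to theta."""
    psi = logistic(D * a * (theta - b))
    return (1.0 - c) * D * a * psi * (1.0 - psi)


def optimal_weight(theta, a, b, c=0.0):
    """General optimal weight P' / (P * Q)."""
    p = icc(theta, a, b, c)
    return icc_slope(theta, a, b, c) / (p * (1.0 - p))


def closed_form_weight(theta, a, b, c):
    """D * a * Psi[D * a * (theta - b) - log c]; requires c > 0."""
    return D * a * logistic(D * a * (theta - b) - math.log(c))


if __name__ == "__main__":
    # Hypothetical item: a = 0.59, b = 0.0, c = 0.20.
    for theta in (-3.0, 0.0, 3.0):
        print(theta,
              optimal_weight(theta, 0.59, 0.0, 0.20),
              closed_form_weight(theta, 0.59, 0.0, 0.20))
```

With $c_g = 0$ the general weight reduces to $D a_g$, which does not depend on $\theta$; with $c_g > 0$ the weight rises with $\theta$, as described above.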

If scoring weights different from the optimal weights specified
by a test model are used, the information derived by using these
inappropriate weights to score a test will be less than what is
potentially available. Birnbaum used the term efficiency to refer to
the information lost due to the use of less than optimal scoring
weights. The concept of efficiency may be formally explicated as
follows. Assuming a particular test model is the true model, let
$I[\theta, x_1]$ and $I[\theta, x_2]$ represent the information
functions of any two scoring formulas $x_1$ and $x_2$ respectively.
Then, the ratio $I[\theta, x_1] / I[\theta, x_2]$ is called the relative
efficiency (at $\theta$) of $x_1$ to $x_2$. If the scoring weights used
in $x_2$ are such that $I(\theta) = I[\theta, x_2]$, then the ratio
$I[\theta, x_1] / I(\theta)$ is called the efficiency (at $\theta$) of
$x_1$. Thus, it is possible, using the optimal scoring weights specified
by a model, to investigate the relative efficiency of the model at
estimating ability when a test is known to be composed of items that
conform to the assumptions of a more general model. For example, the
one-parameter logistic (Rasch) model specifies unit scoring weights for
estimating ability. The efficiency of scores based on these weights
when the items in a test conform to the assumptions of a two-parameter
logistic model is given by

$$\mathrm{Eff}[\theta, x_1] = \frac{I[\theta, x_1]}{I(\theta)} = \frac{\left[ \sum_{g=1}^{n} a_g P_g(\theta) Q_g(\theta) \right]^2}{\left[ \sum_{g=1}^{n} P_g(\theta) Q_g(\theta) \right] \left[ \sum_{g=1}^{n} a_g^2 P_g(\theta) Q_g(\theta) \right]} ,$$

where $x_1 = \sum_{g=1}^{n} u_g$ and $P_g(\theta)$ is given by equation (3)
with $c_g = 0$. The efficiency of scores computed from the weights
specified by the two-parameter logistic model when the items of a test
conform to the assumptions of the three-parameter model is given by

$$\mathrm{Eff}[\theta, x_2] = \frac{I[\theta, x_2]}{I(\theta)} = \frac{\left[ \sum_{g=1}^{n} a_g P_g'(\theta) \right]^2}{\left[ \sum_{g=1}^{n} a_g^2 P_g(\theta) Q_g(\theta) \right] \left[ \sum_{g=1}^{n} [P_g'(\theta)]^2 / \{ P_g(\theta) Q_g(\theta) \} \right]} ,$$

where

$$x_2 = \sum_{g=1}^{n} D a_g u_g , \tag{12}$$

and $P_g(\theta)$ is defined as in equation (3).
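The efficiency ratios just defined can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program used in the study: the helper names (`p_and_slope`, `information`, `max_information`, `efficiency`) and the three example items are hypothetical. The sketch computes $I[\theta, x]$ from equation (1) for arbitrary weights, the maximum information $I(\theta)$, and their ratio.

```python
import math

D = 1.7  # scaling constant quoted in the text


def p_and_slope(theta, a, b, c):
    """Return (P, P') for a three-parameter logistic item."""
    psi = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    p = c + (1.0 - c) * psi
    dp = (1.0 - c) * D * a * psi * (1.0 - psi)
    return p, dp


def information(theta, items, weights):
    """I[theta, x] for the scoring formula x = sum of w_g * u_g."""
    num = 0.0
    den = 0.0
    for (a, b, c), w in zip(items, weights):
        p, dp = p_and_slope(theta, a, b, c)
        num += w * dp
        den += w * w * p * (1.0 - p)
    return num * num / den


def max_information(theta, items):
    """I(theta): information attained with the optimal weights."""
    total = 0.0
    for a, b, c in items:
        p, dp = p_and_slope(theta, a, b, c)
        total += dp * dp / (p * (1.0 - p))
    return total


def efficiency(theta, items, weights):
    """Efficiency (at theta) of the scoring formula with these weights."""
    return information(theta, items, weights) / max_information(theta, items)


if __name__ == "__main__":
    # Hypothetical two-parameter items given as (a, b, c) with c = 0.
    items = [(0.39, -1.0, 0.0), (0.59, 0.0, 0.0), (0.79, 1.0, 0.0)]
    unit = [1.0] * len(items)
    print(efficiency(0.0, items, unit))
```

Passing the weights $D a_g$ instead of unit weights for these $c_g = 0$ items should return an efficiency of one, up to rounding error.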

The question of efficiency has been considered in at least two
previous studies. Birnbaum (1968) investigated the efficiency of unit
scoring weights when the weights specified by the two-parameter model
were appropriate. He did this for abilities in the range
$-3 \le \theta \le 3$ while systematically varying the range of the
distribution of discrimination parameters. Birnbaum considered some
tests in which the discrimination parameters of the items were located
half at one end of the range of the distribution of discrimination
parameters, half at the other end. The items in Birnbaum's tests were
all of middle difficulty, that is, $b_g = 0$, $g = 1, 2, \ldots, n$.
When there was a small difference between the two possible values of
the discrimination index (0.4? vs. 0.58), efficiency was about 97%.
When the values of the discrimination index were 0.31 or 0.75,
efficiency was reduced, and varied from about 60% to about 90%
depending on the level of ability. When the two values of the
discrimination parameter were made to approximate the maximum
difference that is observed in practice (0.20 vs. 0.98), efficiency
varied from about 60% to about 70%, again depending on ability.
Birnbaum also considered the more typical case in which the items of a
test have discrimination parameters distributed more or less uniformly
across the range 0.20 to 0.98. In this case, efficiency was about 80%.
Using a scoring system with an efficiency of 80% is equivalent to
discarding 1/5 of the information available in the test. Clearly, in
such instances it would be inefficient to use unweighted test scores.
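Because information is defined here as inversely proportional to the squared length of the confidence interval, an efficiency figure can also be restated as a widening of that interval. The short worked step below follows from that definition alone; the symbols $L(x)$ and $L_{\mathrm{opt}}$ for the interval lengths under scoring formula $x$ and under optimal weights are notation introduced only for this illustration.

```latex
\[
\frac{L(x)}{L_{\mathrm{opt}}}
  = \sqrt{\frac{I(\theta)}{I[\theta, x]}}
  = \frac{1}{\sqrt{\mathrm{Eff}[\theta, x]}} ,
\qquad
\mathrm{Eff}[\theta, x] = 0.80
\;\Rightarrow\;
\frac{L(x)}{L_{\mathrm{opt}}} = \frac{1}{\sqrt{0.80}} \approx 1.12 .
\]
```

On this reading, an efficiency of 80% corresponds to a confidence interval roughly 12% longer than the one obtainable with optimal weights.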

Lord (1968) investigated the efficiency of ability estimates based
on unit scoring weights when optimal estimates would be based on the
weights specified by the three-parameter logistic model. He found that
the efficiency of unit-weight scores on the verbal part of the
Scholastic Aptitude Test, where it was assumed that the three-parameter
model was the true model, varied from 55% at the lowest ability level
to a maximum of 90% at a high ability level. Here again, the importance
of using scoring weights appropriate to a more general test model was
demonstrated.

Purpose

Recently, there has been increased interest in logistic test
models, particularly the one-parameter logistic (Rasch) model. Because
the restrictive assumptions of the one-parameter model are often
violated by test data (see Hambleton (1969) for a summary of the
evidence), the model will usually not fit data as well as the more
general logistic models. Hence, using the one-parameter model to
estimate ability when a more general model would provide a more
appropriate estimate will result in a loss of information in the sense
defined earlier.

The questions asked in this study were as follows: How much
information is obtained about an examinee's ability using the scoring
systems of the one-, two- and three-parameter logistic test models
as the range of the distribution of item discrimination parameters and
the mean level of guessing on the items are varied systematically in
simulated tests? Under these circumstances, what is the efficiency of
the scoring systems of the less appropriate one- and two-parameter
models when the comparative standard is the amount of information
provided by the more appropriate two- and three-parameter models? Since
information curves and efficiency are both functions of ability,
answers to the two questions were obtained for different values of
$\theta$.

Methodology

Generation of Item Parameters

To begin with, it was assumed that only a single latent ability
was being measured. This is an assumption typically made in latent
trait theory (McDonald, 1967). The situation which was envisioned as
being in some sense typical of nature was one in which scores on this
single latent ability are normally distributed in the population. A
suitable scaling of the ability continuum would establish a mean of the
ability distribution of zero and a standard deviation of one. Under
these conditions, over 99% of the population would have ability scores
on the interval [-3, 3]. These limits for the range of ability were
chosen for the study.
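The coverage figure quoted for the interval [-3, 3] can be confirmed with the standard normal distribution function. The snippet below is a small sketch for that purpose only; the function name `normal_coverage` is an assumption of this illustration.

```python
import math


def normal_coverage(lo, hi):
    """Probability that a standard normal variate falls in [lo, hi]."""
    def cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return cdf(hi) - cdf(lo)


if __name__ == "__main__":
    # Share of a standard normal population with ability in [-3, 3].
    print(normal_coverage(-3.0, 3.0))
```

The result is about 0.997, consistent with the statement that over 99% of the population lies within three standard deviations of the mean.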

Tests were simulated so that the items ranged in difficulty
within reasonable limits for the group being tested. In effect, it was
assumed the test would contain no item so easy that more than 95% of a
population approximately normally distributed on the interval [-3, 3]
would get it correct; also, no item would be so difficult that less
than [figure illegible] of the population would get it correct.
Difficulty parameters $b_g$, $g = 1, 2, \ldots, n$, were randomly
assigned to each of the items in a simulated test subject to the
restriction that they were drawn from a population distribution of
difficulty parameters that was rectangular on the interval [-2, 2] with
a mean of zero. Lord's (1968) work reveals this choice of assumed
distribution and range of item difficulty parameters to be realistic,
at least for the kind of test he studied.

The item discrimination parameters, $a_g$, $g = 1, 2, \ldots, n$,
were assumed to be drawn from a uniform population distribution with a
mean of 0.59 and a range which was systematically varied across
simulations between zero and 0.80, inclusive. The results obtained by
Lord (1960) and Ross (1966) support the choice of this form of
distribution for the discrimination parameters.

The magnitude of the item guessing parameters, $c_g$,
$g = 1, 2, \ldots, n$, for each set of test data was controlled by the
value of $\bar{c}$, where $\bar{c}$ was the mean of the guessing
parameters of the items in a simulated test. Assuming five-option
multiple-choice tests and a heterogeneous ability group, it seemed
reasonable also to assume that individual values of $c_g$ and $\bar{c}$
would be bounded on the interval [.00, .20]. Given a specified value of
$\bar{c}$, the $c_g$'s were generated subject to two constraints:

(1) $\bar{c} = \sum_{g=1}^{n} c_g / n$

(2) $\bar{c} - \min\{.20 - \bar{c}, \bar{c}\} \le c_g \le \bar{c} + \min\{.20 - \bar{c}, \bar{c}\}$, $g = 1, 2, \ldots, n$.
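The sampling scheme described in this section can be sketched as follows. The paper does not say how the two constraints on the guessing parameters were enforced, so the mirrored-pair device used here (drawing offsets $+d$ and $-d$ about $\bar{c}$) is purely an assumption chosen because it satisfies both constraints; the function name `generate_items` and its arguments are likewise hypothetical.

```python
import random


def generate_items(n, a_range, c_mean, a_mean=0.59, seed=None):
    """Draw n items as (a, b, c) tuples under the stated distributions.

    a: uniform with mean a_mean and total width a_range.
    b: uniform (rectangular) on [-2, 2].
    c: within c_mean +/- min(0.20 - c_mean, c_mean), averaging c_mean.
    """
    rng = random.Random(seed)
    half = a_range / 2.0
    spread = min(0.20 - c_mean, c_mean)

    # Assumption: mirrored offsets keep the mean of the c values at c_mean.
    offsets = []
    for _ in range(n // 2):
        d = rng.uniform(0.0, spread)
        offsets.extend([d, -d])
    if n % 2:
        offsets.append(0.0)  # odd test length: one item sits at the mean
    rng.shuffle(offsets)

    items = []
    for off in offsets:
        a = rng.uniform(a_mean - half, a_mean + half)
        b = rng.uniform(-2.0, 2.0)
        items.append((a, b, c_mean + off))
    return items


if __name__ == "__main__":
    for item in generate_items(15, 0.80, 0.10, seed=0):
        print(item)
```

When $\bar{c}$ is 0.00 or 0.20 the permitted spread is zero, so every $c_g$ equals $\bar{c}$.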



Procedure

Four ranges of the distribution of discrimination parameters
were considered: 0.00, 0.20, 0.40 and 0.80. Three mean levels of
the guessing parameter were considered: 0.00, 0.10 and 0.20. (In the
case of $\bar{c} = 0.00$ or $\bar{c} = 0.20$, all the values of $c_g$
were zero or 0.20, respectively.) Under the conditions specified above,
item parameters were generated at random by computer for eleven of the
twelve possible combinations of the range of the distribution of
discrimination parameters and mean level of guessing. (Excluded was the
case where the range and $\bar{c}$ would both be zero.) Each simulated
test was assumed to have 15 items.

Taking the three-parameter logistic model to be the true model
(except when $\bar{c} = 0$, in which case the two-parameter logistic
model was taken to be the true model), the information provided by
scores based on the weights of the one-, two- and three-parameter
logistic models was computed for each of seven values of $\theta$,
$\theta = -3 + k$, $k = 0, 1, \ldots, 6$. The efficiency of the scoring
systems specified by the less general test models was then determined
for each level of ability. All the computations were done using a
program developed by Hambleton (1970).
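The design can be enumerated directly. The sketch below is only a bookkeeping illustration (the names `A_RANGES`, `C_MEANS`, `conditions` and `thetas` are assumptions, and it does not reproduce the Hambleton (1970) program); it lists the combinations of discrimination range and mean guessing level and the ability grid at which information was evaluated.

```python
from itertools import product

A_RANGES = (0.00, 0.20, 0.40, 0.80)  # ranges of the discrimination parameters
C_MEANS = (0.00, 0.10, 0.20)         # mean levels of the guessing parameter

# Drop the single excluded case in which both the range and the mean
# guessing level are zero.
conditions = [
    (a_range, c_mean)
    for a_range, c_mean in product(A_RANGES, C_MEANS)
    if not (a_range == 0.0 and c_mean == 0.0)
]

# Ability levels theta = -3 + k, k = 0, 1, ..., 6.
thetas = [-3 + k for k in range(7)]

if __name__ == "__main__":
    print(len(conditions), thetas)
```

The order in which `product` yields the conditions is not meant to match the numbering of the sets in Table 1.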



Results and Discussion

The notation $x_1 = \sum_{g=1}^{n} u_g$,
$x_2 = \sum_{g=1}^{n} D a_g u_g$ and
$x_3 = \sum_{g=1}^{n} D a_g \Psi[D a_g (\theta - b_g) - \log c_g] u_g$
is used to represent the scoring formulas of the one-, two- and
three-parameter logistic models respectively. The quantities
$I[\theta, x_1]$, $I[\theta, x_2]$, $I[\theta, x_3]$,
$\mathrm{Eff}[\theta, x_1] = I[\theta, x_1] / I[\theta, x_3]$,
$\mathrm{Eff}[\theta, x_2] = I[\theta, x_2] / I[\theta, x_3]$ and
$\mathrm{Eff}[\theta, x_1, x_2] = I[\theta, x_1] / I[\theta, x_2]$ for
$\theta = -3 + k$, $k = 0, 1, \ldots, 6$, are reported in Table 1 for
eleven sets of data. When $\bar{c} = 0$, $x_2 = x_3$; hence
$I[\theta, x_3]$, $\mathrm{Eff}[\theta, x_2]$ and
$\mathrm{Eff}[\theta, x_1, x_2]$ are not reported. When the range of the
discrimination parameters is zero, $x_1 = x_2$, and so
$I[\theta, x_2]$, $\mathrm{Eff}[\theta, x_2]$ and
$\mathrm{Eff}[\theta, x_1, x_2]$ are not reported.

The information values displayed in Table 1 reveal an approximately
bell-shaped relationship between information and ability. Information
is greatest near the middle of the ability distribution and much less
at the extremes. When guessing occurs ($\bar{c} > 0$), less information
is provided by all three scoring systems, but the decrease is
particularly noticeable at low ability levels for scoring formulas
$x_1$ and $x_2$. The relationships among the information functions of
the scoring systems, under the assumption that the three-parameter
model is the appropriate one, may be roughly summarized by the
inequalities

$$I[\theta, x_3] \ge I[\theta, x_2] \ge I[\theta, x_1] .$$

This relationship appears to hold except for the situations involving
very low levels of ability and $\bar{c} > 0$, when
$I[\theta, x_1] \ge I[\theta, x_2]$. It appears that when guessing is a
component in test performance, unit scoring weights are better than the
weights specified by the two-parameter model at estimating the ability
of low-ability examinees.
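The reversal at low ability can be read directly off Table 1. As an illustrative check, the ratios can be recomputed from the printed information values; the inputs below are the rounded Set 9 entries at $\theta = -3$, and the helper name `ratios` is an assumption of this sketch.

```python
def ratios(i1, i2, i3):
    """Efficiency ratios formed from three information values."""
    return {
        "eff_x1": i1 / i3,     # Eff[theta, x1]
        "eff_x2": i2 / i3,     # Eff[theta, x2]
        "rel_x1_x2": i1 / i2,  # Eff[theta, x1, x2]
    }


# Rounded Table 1 entries for Set 9 at theta = -3.0.
set9_low = ratios(0.19, 0.16, 0.29)

if __name__ == "__main__":
    for name, value in set9_low.items():
        print(name, round(value, 2))
```

Because the inputs are rounded to two decimals, the recomputed ratios need not match the printed efficiencies exactly, but the relative efficiency of $x_1$ to $x_2$ still exceeds one.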

One additional comment should be made about information functions.
It is possible to obtain any shape for the information function that is
desired by judicious choice of test items (Birnbaum, 1968). The
information functions described here may be considered relevant for at
least some testing situations because the distributions of item
parameters chosen to guide the simulation of test data were similar to
what has been observed

TABLE 1

Information Curves and Efficiency

Set 1
Discrimination Parameters: mean a = .59, Range = .20; .49 to .69.
Guessing Parameters:       mean c = .00, Range = .00.

Ability   I[θ,x1]   I[θ,x2]   I[θ,x3]   Eff[θ,x1]   Eff[θ,x2]   Eff[θ,x1,x2]
  -3.0      .99       .99       --         .99         --           --
  -2.0     1.85      1.86       --         .99         --           --
  -1.0     2.63      2.66       --         .99         --           --
   0.0     2.82      2.84       --         .99         --           --
   1.0     2.43      2.45       --         .99         --           --
   2.0     1.74      1.75       --         .99         --           --
   3.0      .99      1.00       --         .99         --           --

Set 2
Discrimination Parameters: mean a = .59, Range = .40; .39 to .79.
Guessing Parameters:       mean c = .00, Range = .00.

Ability   I[θ,x1]   I[θ,x2]   I[θ,x3]   Eff[θ,x1]   Eff[θ,x2]   Eff[θ,x1,x2]
  -3.0      .94       .97       --         .97         --           --
  -2.0     1.80      1.85       --         .97         --           --
  -1.0     2.65      2.74       --         .97         --           --
   0.0     2.80      2.91       --         .96         --           --
   1.0     2.32      2.39       --         .97         --           --
   2.0     1.64      1.69       --         .97         --           --
   3.0      .95       .97       --         .98         --           --

Set 3
Discrimination Parameters: mean a = .59, Range = .80; .19 to .99.
Guessing Parameters:       mean c = .00, Range = .00.

Ability   I[θ,x1]   I[θ,x2]   I[θ,x3]   Eff[θ,x1]   Eff[θ,x2]   Eff[θ,x1,x2]
  -3.0      .75       .87       --         .86         --           --
  -2.0     1.55      1.79       --         .87         --           --
  -1.0     2.59      2.98       --         .87         --           --
   0.0     2.73      3.13       --         .87         --           --
   1.0     2.01      2.29       --         .87         --           --
   2.0     1.34      1.52       --         .88         --           --
   3.0      .77       .87       --         .87         --           --

Set 4
Discrimination Parameters: mean a = .59, Range = .20; .49 to .69.
Guessing Parameters:       mean c = .10, Range = .20; .00 to .20.

Ability   I[θ,x1]   I[θ,x2]   I[θ,x3]   Eff[θ,x1]   Eff[θ,x2]   Eff[θ,x1,x2]
  -3.0      .40       .39       .54        .74         .73         1.01
  -2.0     1.06      1.07      1.22        .87         .87          .99
  -1.0     1.84      1.86      1.95        .94         .95          .99
   0.0     2.19      2.23      2.25        .97         .99          .99
   1.0     2.03      2.05      2.05        .99        1.00          .99
   2.0     1.53      1.54      1.53        .99        1.00          .99
   3.0      .89       .90       .90        .99        1.00          .99

Set 5
Discrimination Parameters: mean a = .59, Range = .40; .39 to .79.
Guessing Parameters:       mean c = .10, Range = .20; .00 to .20.

Ability   I[θ,x1]   I[θ,x2]   I[θ,x3]   Eff[θ,x1]   Eff[θ,x2]   Eff[θ,x1,x2]
  -3.0      .39       .37       .53        .73         .70         1.04
  -2.0     1.04      1.05      1.21        .86         .87          .99
  -1.0     1.86      1.92      2.02        .92         .95          .97
   0.0     2.18      2.27      2.30        .95         .99          .96
   1.0     1.93      2.00      2.01        .96        1.00          .96
   2.0     1.44      1.48      1.48        .97        1.00          .97
   3.0      .85       .87       .87        .98        1.00          .98

Set 6
Discrimination Parameters: mean a = .59, Range = .80; .19 to .99.
Guessing Parameters:       mean c = .10, Range = .20; .00 to .20.

Ability   I[θ,x1]   I[θ,x2]   I[θ,x3]   Eff[θ,x1]   Eff[θ,x2]   Eff[θ,x1,x2]
  -3.0      .34       .29       .48        .70         .60         1.16
  -2.0      .92       .94      1.14        .80         .82          .97
  -1.0     1.83      2.06      2.17        .84         .95          .89
   0.0     2.12      2.45      2.48        .85         .99          .86
   1.0     1.67      1.92      1.93        .86        1.00          .87
   2.0     1.18      1.34      1.34        .88        1.00          .88
   3.0      .69       .79       .79        .87        1.00          .87
Set 7
Discrimination Parameters: mean a = .59, Range = .20; .49 to .69.
Guessing Parameters:       mean c = .20, Range = .00.

Ability   I[θ,x1]   I[θ,x2]   I[θ,x3]   Eff[θ,x1]   Eff[θ,x2]   Eff[θ,x1,x2]
  -3.0      .22       .22       .32        .68         .68         1.01
  -2.0      .69       .69       .87        .79         .80          .99
  -1.0     1.33      1.35      1.53        .87         .89          .98
   0.0     1.70      1.72      1.82        .93         .95          .98
   1.0     1.64      1.66      1.69        .97         .98          .99
   2.0     1.28      1.29      1.29        .99        1.00          .99
   3.0      .76       .77       .77        .99        1.00          .99

Set 8
Discrimination Parameters: mean a = .59, Range = .40; .39 to .79.
Guessing Parameters:       mean c = .20, Range = .00.

Ability   I[θ,x1]   I[θ,x2]   I[θ,x3]   Eff[θ,x1]   Eff[θ,x2]   Eff[θ,x1,x2]
  -3.0      .22       .21       .32        .68         .65         1.05
  -2.0      .67       .67       .85        .79         .79         1.00
  -1.0     1.35      1.40      1.58        .85         .89          .96
   0.0     1.69      1.78      1.88        .90         .95          .95
   1.0     1.56      1.62      1.65        .95         .98          .96
   2.0     1.20      1.23      1.24        .97        1.00          .97
   3.0      .73       .74       .74        .98        1.00          .98

Set 9
Discrimination Parameters: mean a = .59, Range = .80; .19 to .99.
Guessing Parameters:       mean c = .20, Range = .00.

Ability   I[θ,x1]   I[θ,x2]   I[θ,x3]   Eff[θ,x1]   Eff[θ,x2]   Eff[θ,x1,x2]
  -3.0      .19       .16       .29        .66         .54         1.23
  -2.0      .59       .59       .78        .76         .76         1.00
  -1.0     1.33      1.51      1.69        .79         .89          .88
   0.0     1.65      1.94      2.06        .80         .95          .85
   1.0     1.34      1.56      1.59        .85         .98          .86
   2.0      .97      1.11      1.11        .87        1.00          .88
   3.0      .58       .67       .67        .87        1.00          .87



Set 10
Discrimination Parameters: mean a = .59, Range = .00.
Guessing Parameters:       mean c = .10, Range = .20; .00 to .20.

Ability   I[θ,x1]   I[θ,x2]   I[θ,x3]   Eff[θ,x1]     Eff[θ,x2]   Eff[θ,x1,x2]
  -3.0      .40       --        .55     [illegible]      --           --
  -2.0     1.0?       --       1.21        .87           --           --
  -1.0     1.80       --       1.90        .95           --           --
   0.0     2.19       --       2.22        .98           --           --
   1.0     2.09       --       2.10       1.00           --           --
   2.0     1.58       --       1.58       1.00           --           --
   3.0      .91       --        .91       1.00           --           --

Set 11
Discrimination Parameters: mean a = .59, Range = .00.
Guessing Parameters:       mean c = .20, Range = .00.

Ability   I[θ,x1]   I[θ,x2]   I[θ,x3]   Eff[θ,x1]   Eff[θ,x2]   Eff[θ,x1,x2]
  -3.0      .22       --        .33        .68         --           --
  -2.0      .68       --        .87        .79         --           --
  -1.0     1.30       --       1.48        .88         --           --
   0.0     1.69       --       1.79        .95         --           --
   1.0     1.69       --       1.72        .98         --           --
   2.0     1.33       --       1.33       1.00         --           --
   3.0      .78       --        .78       1.00         --           --

in some testing applications.

The results for efficiency may be summarized as follows: When
there is no guessing (i.e., $\bar{c} = 0$), the efficiency of a scoring
system using unit weights remains high (over 95%) until the range of
the distribution of discrimination parameters becomes large (0.80 in
this study). Moreover, efficiency is relatively constant across
different levels of ability. When guessing is introduced, this picture
changes dramatically. Then, at low ability levels the efficiency of
scoring systems $x_1$ or $x_2$ is markedly reduced, independently of
the magnitude of the range of the distribution of discrimination
parameters. Of course, as this range increases, the efficiency of $x_1$
and $x_2$ decreases, again most noticeably at the low ability levels.
Even with a maximum range of the distribution of discrimination
parameters (0.80), however, $x_2$ still provides very efficient
estimates of ability for examinees with high ability. Under the same
circumstances, $x_1$ has considerably reduced efficiency.

On the basis of these results, it appears that when a test is
being used to estimate ability across a broad range of ability levels
and when guessing is a factor in test performance, the scoring system
of the three-parameter model is to be preferred. On the other hand, if
only high-ability examinees are of interest, then even in the presence
of guessing, the scoring system of the two-parameter model provides
acceptable ability estimates no matter how wide the range of the
distribution of discrimination parameters becomes within the limits
studied here. Unit scoring weights, that is, the scoring system of the
one-parameter (Rasch) model, appear to provide efficient estimates of
ability when there is little or no guessing and when the range of the
distribution of discrimination parameters is fairly small.